Cluster analysis and PCA for modeling population structure

Author

  • Diana Luca
Abstract

Case-control association studies are widely used to find genetic variants causally associated with phenotypes. Unfortunately, population structure can induce false positives. For instance, if cases and controls have different genetic backgrounds, differences in the frequencies of distinct forms of a variant might be due to differences in ancestral population of origin. Traditional approaches to controlling for the effects of population stratification, including eigen-analysis, cluster analysis and matching based on genetic markers, are employed to improve the modeling of structure. Our approach goes further in that we show how to systematically obtain optimal matching and how to determine outlying subjects that cannot be successfully matched to others in the available registry. Simulations and an application to real data show improved results with the new method.
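The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the author's method. It assumes genotypes coded as allele counts (0, 1, 2), uses a plain SVD-based PCA as the eigen-analysis step, and replaces optimal matching with a much simpler stand-in: the distance from each subject to the nearest subject of the opposite case-control status in principal-component space. The function names (`top_pcs`, `nearest_opposite_distance`, `simulate`) and the simulated two-population data are hypothetical and serve illustration only.

```python
import numpy as np

def top_pcs(genotypes, k=2):
    """Project subjects onto the top-k principal components.

    genotypes: (n_subjects, n_markers) array of allele counts (0, 1, 2).
    """
    g = np.asarray(genotypes, dtype=float)
    g = g - g.mean(axis=0)              # center each marker
    sd = g.std(axis=0)
    g = g[:, sd > 0] / sd[sd > 0]       # drop monomorphic markers, scale the rest
    u, s, _ = np.linalg.svd(g, full_matrices=False)
    return u[:, :k] * s[:k]             # subject coordinates in PC space

def nearest_opposite_distance(pcs, is_case):
    """Distance from each subject to the closest subject of the other status."""
    is_case = np.asarray(is_case, dtype=bool)
    d = np.linalg.norm(pcs[:, None, :] - pcs[None, :, :], axis=2)
    d[is_case[:, None] == is_case[None, :]] = np.inf   # mask same-status pairs
    return d.min(axis=1)

def simulate(n_per_pop=50, n_markers=200, seed=0):
    """Hypothetical data: two populations with different allele frequencies."""
    rng = np.random.default_rng(seed)
    freq_a = rng.uniform(0.1, 0.9, n_markers)
    freq_b = np.clip(freq_a + rng.normal(0, 0.2, n_markers), 0.05, 0.95)
    pop_a = rng.binomial(2, freq_a, (n_per_pop, n_markers))
    pop_b = rng.binomial(2, freq_b, (n_per_pop, n_markers))
    return np.vstack([pop_a, pop_b])

if __name__ == "__main__":
    genotypes = simulate()
    # Population A: 25 cases and 25 controls. Population B: 50 cases, no controls.
    is_case = np.array([True] * 25 + [False] * 25 + [True] * 50)
    pcs = top_pcs(genotypes, k=2)
    dist = nearest_opposite_distance(pcs, is_case)
    print("median distance, cases in A:", np.median(dist[:25]))
    print("median distance, cases in B:", np.median(dist[50:]))
```

In this simulated setup the cases from the second population have no controls of similar ancestry, so their nearest-control distances tend to be large. Subjects of this kind are the ones a matching procedure would be expected to flag as unmatchable. A true optimal matching would additionally pair subjects so as to minimize the total distance, which this sketch does not attempt.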

Similar resources

Analysis of genetic diversity, phylogenetic relationships and population structure of Arasbaran cornelian cherry (Cornus mas L.) genotypes using ISSR molecular markers

Cornelian cherry (Cornus mas L.), considered the ancestor of cultivated trees in the Arasbaran region, is a medicinally and economically important plant species. However, little is known about the genetic diversity, breeding programs, and population structure of this species in that region. With this in view, the main objectives of the present study were to analyze the genetic diversity, phyloge...


Analysis of physiochemical and microbial quality of waters of the Karkheh River in southwestern Iran using multivariate statistical methods

Rapid population growth as well as agricultural and industrial development have increased the contamination of Iranian rivers. This study utilized principal components analysis (PCA) to determine the degree of significance of qualitative parameters of water resources in the Karkheh River in southwestern Iran. Cluster analysis (CA) grouped the monitoring stations based on the water quality data ...


EVALUATION OF BIODIVERSITY OF FIELD BINDWEED POPULATION IN VARAMIN

Many polymorphisms exist in field bindweed (Convolvulus arvensis L.) populations. These differences, which appear especially in the flowers and leaves, give rise to different biotypes of this weed that differ in dry weight, rate- and flowering period. This research was carried out from 2005 to 2006 at the Weed Research Department, Iranian Plant Protection Research Institute, for identifica...


Prioritizing Effective Factors in the Making Ethical Organizations by Using Combined Method of Interpretative Structural Modeling (ISM) and Principal Component Analysis (PCA)

Nowadays, organizations consider ethical principles in the business environment an advantage and seek to strengthen them. This requires a coherent, interactive and cognitive understanding of the parts of the organization's internal and external environment, which leads to the realization of the rights of the organization's beneficiaries. The purpose of this paper is to prioritize the factors in...


Energy Consumption Modeling in Activated Sludge Process Using Coupling PCA-ANFIS Approach

The main challenge in Wastewater Treatment Plants (WWTP) using the activated sludge process is reducing energy consumption, which varies according to the pollutant load of the influent. This energy is used mainly for the aerators in the biological process. The modeling of energy consumption according to the decision parameters deemed necessary for good control of the active sludge ...


Publication date: 2007